We propose and demonstrate a scheme to realize a high-efficiency truly quantum random number
generator (RNG) at room temperature (RT). Using an effective extractor with a simple time-bin
encoding method, the avalanche pulses of an avalanche photodiode (APD) are converted into
high-quality random numbers (RNs) that are robust to slowly varying noise such as fluctuations of
pulse intensity and temperature. A light source is compatible with this scheme but not necessary.
Therefore, the robustness of the system is effectively enhanced. The random bit generation rate of
this proof-of-principle system is 0.69 Mbps with two APDs and 0.34 Mbps with a single APD. The
results indicate that a high-speed RNG chip based on the scheme is potentially available with an
integrable APD array.



I. INTRODUCTION 

Random numbers are important in many fields of scientific research and real-life applications,
such as fundamental physical research, computer science and the lottery industry. Although pseudo
RNs can be generated by computer software and hardware at very high speed, high-quality truly
random number generators (TRNGs) must be adopted in some important applications. For example,
TRNGs play important roles in information security, in which quantum cryptography is an emerging
technology with potential applications to the next-generation information security infrastructure.

The unpredictability of a physical process is the resource for generating truly random numbers,
and two steps are typically necessary to generate RNs from such processes. First, signals related
to a random physical process must be effectively generated and gathered. There have been many
TRNGs based on physics, for example, circuit noise [1-4] and radioactive decay [5]. Among the
available resources, quantum mechanics is well suited to generating a nondeterministic signal.
Some RNGs are based on processes that are quantum in nature, such as the wave function collapse
of a single photon due to measurement [6-10], entangled state measurement [11], effects of vacuum
fluctuation [12, 13] and quantum phase fluctuation [14]. In many of these schemes, an almost
single-photon light source is necessary for generating quantum signals [7, 9, 10, 15]. A quantum
random number generator (QRNG) in which the light source is compatible but not necessary may
have advantages in integration and usage.

The second step of a physical RNG is to implement an effective encoding method to transform
these physical signals into RNs. The efficiency of the encoding method is a key limitation on the
generation rate of RNGs. As devices and environments may vary in real time, postprocessing will
be necessary to generate high-quality RNs. Even commercial products, such as the IdQuantique
Quantis random number generator, in which photons (light particles) are sent one by one onto a
semi-transparent mirror and detected, cannot avoid postprocessing. Algorithms applied to raw RNs
may reduce the efficiency of the final RNs and increase the complexity and cost of the system.
Thus, the kernel of a TRNG is an effective encoding method that is immune to slowly varying noise
interference and needs no complex postprocessing to remove bias. The boundary between encoding
methods and postprocessing algorithms should be noted here. Although the boundary is not clearly
defined, we adopt the principle that postprocessing algorithms take a large amount of resources
[16]. Many QRNGs can generate high-quality RNs by utilizing simple encoding methods, but
efficiency is a dominant limitation for most of them [10, 15, 17]; for example, the efficiency in
reference [10] is 40%. Others have achieved very high rates by encoding the amplitude of the probe
current of the detector into multi-bit RNs [18]. However, the amplitude of the probe current is
sensitive to devices and environments, so stable devices with high resolution are required when
implementing these RNGs, which implies higher cost and greater complexity.
Bias-free physical processes would be ideal for RNGs, because the encoding method could then be
very simple and use minimal resources. However, the processes used in RNGs are always biased, so
the encoding method plays a key role in obtaining high-quality RNs. John von Neumann first
proposed an unbiased encoding method for biased Bernoulli trials [19]. It has been used in QRNGs
[17]. However, the efficiency of the von Neumann method is limited to 0.25. The method was
subsequently developed in order to obtain a higher efficiency [20-22]; among these works, Elias
achieved a very high efficiency in the limit of infinitely long sequences [20]. Ren et al.
proposed another encoding method based on the precise discrimination of the photon numbers of two
consecutive pulses [10]. This scheme needs high-precision devices to discriminate photon numbers,
and the efficiency of the method is limited to 0.5.
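
The 0.25 limit quoted for the von Neumann method can be checked with a minimal sketch. The pairing
convention used here (01 -> 0, 10 -> 1, equal pairs discarded) and the function names are
illustrative assumptions, not code from reference [19].

```python
def von_neumann_extract(bits):
    """Read non-overlapping pairs: 01 -> 0, 10 -> 1; discard 00 and 11."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)  # the first bit of an unequal pair is the output
    return out


def von_neumann_efficiency(p):
    """Expected output bits per input bit for bias p.

    A pair yields one bit with probability 2p(1-p) and consumes two input
    bits, so the efficiency is p(1-p), which peaks at 0.25 for p = 1/2.
    """
    return p * (1.0 - p)
```

For any bias p the two unequal pairs are equally likely, which is why the output is unbiased while
at least three quarters of the raw bits are lost.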

In this study, we propose a TRNG scheme based on the discrimination of avalanche pulses of an
APD. These pulses can be generated by the dark current of the APD or by incident photons, so a
light source is compatible but not necessary in the scheme. A robust encoding method for biased
Bernoulli trials with higher efficiency than previous works is proposed. Furthermore, we test the
system with multiple APDs; the experimental results indicate the feasibility of implementing
high-quality, robust QRNG chips using an integrated APD array.


II. RNG SCHEME 

According to the quantum theory of lasers, the photon statistics of a laser pulse operating above
threshold follow the Poisson distribution [23], which can be preserved after drastic attenuation.
The Poisson distribution is

\[ P_\lambda(n) = \frac{\lambda^{n}}{n!} e^{-\lambda}, \qquad (1) \]

where $\lambda$ is the mean photon number of a laser pulse and $P_\lambda(n)$ is the probability
that the pulse contains $n$ photons. Thus, the coherent state produces an unpredictable photon
number for every detection, and this quantum property can be used to implement a QRNG.

If the detection efficiency of the APD is not taken into account, a laser pulse causes an
avalanche pulse whenever the pulse contains $n > 0$ photons, and the probability of this event is

\[ \sum_{n>0} P_\lambda(n) = 1 - P_\lambda(n = 0) = 1 - e^{-\lambda}. \qquad (2) \]

The avalanche pulses of an APD can also be generated by dark currents. Because of thermal
fluctuation, electrons of the APD may transit from the top of the valence band to the conduction
band. Electrons in the conduction band are accelerated by the high reverse-bias electric field
and lead to avalanche pulses. Because the thermal fluctuation at RT is much smaller than the
energy gap between the valence band and the conduction band, the transiting probability is very
small. We use the tight-binding approximation here. Thus, the transiting events of different atoms
are independent and identically distributed (IID), and the statistics of the total events follow
the binomial distribution of repeated Bernoulli trials

\[ P(n_1 = k) = \binom{N_1}{k} p_1^{k} (1 - p_1)^{N_1 - k}, \qquad (3) \]



where $n_1$ is the total number of electrons transiting to the conduction band during a certain
period $\tau$, $N_1$ is the total number of electrons at the top of the valence band, $p_1$ is the
transiting probability of a single electron, and $P(n_1 = k)$ is the probability that $k$
electrons transit. As $p_1$ is much smaller than 1 and $N_1$ is large in the material, the
transiting process follows the Poisson limit theorem

\[ \lim_{N_1 \to \infty,\; p_1 \to 0} \binom{N_1}{k} p_1^{k} (1 - p_1)^{N_1 - k} = P_{\lambda_1}(k), \qquad (4) \]


where $\lambda_1$ is the mean number of electrons transiting to the conduction band during
$\tau$. Thus, the electrons' transiting process follows the Poisson distribution, and so does the
dark count. The dark count is then as usable as a laser pulse. Dark counts of an APD were first
used by Tawfeeq to propose an RNG scheme [24]. The scheme was simpler and provided a new idea for
RNGs based on APDs. However, that scheme did not present an effective encoding method and could
not generate RNs, and an RNG with high randomness based on the dark counts of an APD has not been
implemented before.
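
The Poisson limit used for the dark counts can be illustrated numerically. The sketch below
compares a binomial distribution with many trials and a tiny success probability against the
Poisson distribution with the same mean; the function names and the chosen mean of 0.5 are
illustrative assumptions, not measured device parameters.

```python
import math


def binomial_pmf(n_total, k, p):
    """Probability of k successes in n_total independent trials."""
    return math.comb(n_total, k) * p**k * (1.0 - p) ** (n_total - k)


def poisson_pmf(lam, k):
    """Poisson probability of k events for mean lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def max_abs_gap(n_total, lam, k_max=10):
    """Largest pointwise gap between the two distributions for k <= k_max,
    with the single-trial probability set to lam / n_total."""
    p = lam / n_total
    return max(
        abs(binomial_pmf(n_total, k, p) - poisson_pmf(lam, k))
        for k in range(k_max + 1)
    )
```

The gap shrinks as the number of trials grows at fixed mean, which is the sense in which the
transiting electrons, and hence the dark counts, can be treated as Poisson distributed.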

Because the sum of Poisson-distributed random variables follows the Poisson distribution, the
total probability of an avalanche during $\tau$ is

\[ p = P_{\lambda'}(n > 0) = 1 - e^{-\lambda'}, \qquad (5) \]

where $\tau$ is the detection window, $\lambda' = \eta(\lambda + \lambda_1)$, and $\eta$ is the
detection efficiency of the APD. Clearly, the probability of no avalanche pulse is
$q = 1 - p = e^{-\lambda'}$. The detection process is then a Bernoulli trial. A simple and robust
encoding method for biased Bernoulli trials is proposed here. It is an extension of the von
Neumann method but with much higher efficiency. The encoding method is constructed as follows.
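
As a small worked example of the avalanche probability, the sketch below evaluates
p = 1 - exp(-eta * (lambda + lambda_1)) and inverts it to find the effective mean needed for a
target p. The function and parameter names are illustrative assumptions.

```python
import math


def avalanche_probability(mean_photons, mean_dark, efficiency):
    """p = 1 - exp(-eta * (lambda + lambda_1)) for one detection window."""
    effective_mean = efficiency * (mean_photons + mean_dark)
    return 1.0 - math.exp(-effective_mean)


def effective_mean_for(p):
    """Effective mean eta * (lambda + lambda_1) that gives avalanche probability p."""
    return -math.log(1.0 - p)
```

For instance, an unbiased trial with p = 0.5 requires an effective mean of ln 2, about 0.69
detected events per window.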

We consider the physical system of an avalanche photodiode (APD) working in Geiger mode. We treat
a detection window of the APD as a time bin and order these time bins in time. According to the
discussion above, avalanches caused by laser photons and thermal fluctuation in different time
bins are IID if the experimental parameters are constant, namely, $p = p_0$. In fact,
experimental parameters are hardly ever constant. We claim that the encoding method constructed
here also applies when the experimental parameters vary slowly, so the encoding method is
effective and robust. We mark a "1" in a time bin if an avalanche happens in the corresponding
detection window; otherwise we mark a "0" in it. Consider $N$ successive time bins as a time-bin
block. There are in total $\binom{N}{k}$ possible combinations when $k$ "1"s are marked in the
block; if we have no additional information about the block, the uncertainty of these $N$ time
bins is $\binom{N}{k}$. These $\binom{N}{k}$ equiprobable combinations are then encoded into
uniform RNs from 0 to $\binom{N}{k} - 1$. The encoding process is a one-to-one mapping, and the
mapping function is

\[ f(k_1, k_2, \cdots, k_k) = \sum_{j=1}^{k} \binom{N - k_j}{k - j + 1}, \qquad (6) \]

where $k_j$ means that the $j$-th "1" happened in the $k_j$-th time bin.

We now interpret the mapping function. As discussed, if we only know that there are $k$ "1"s in
the time-bin block, the uncertainty is $\binom{N}{k}$. If we learn the position in the time-bin
sequence of the first "1", namely, $k_1$ is known, the uncertainty reduces to
$\binom{N-k_1}{k-1}$; in other words, we gain information content about the block. Similarly,
when $k_2$ is also known, the uncertainty reduces to $\binom{N-k_2}{k-2}$ and the information
content we have increases. The uncertainty reduces further if $k_3, k_4, \cdots$ are also known.
In the extreme case, if $k_1, k_2, \cdots, k_k$ are all known, the remaining uncertainty becomes
zero, and we have all the information content about these $\binom{N}{k}$ combinations. We sum the
information content obtained from $k_1$ to $k_k$, and the sum is the RN we want. The mapping
function is

\[ f(k_1, \cdots, k_k) = \sum_{j=1}^{k} \left[ \binom{N-k_j+1}{k-j+1} - \binom{N-k_j}{k-j} \right] = \sum_{j=1}^{k} \binom{N-k_j}{k-j+1}, \]

where we have used the combination formula $\binom{n+1}{m+1} = \binom{n}{m} + \binom{n}{m+1}$.
It is evident that $\binom{N-k_j}{k-j+1}$ is monotonic in $k_j$. Thus, the mapping function is
monotonic. The maximum possible number obtained from the mapping function is
$f(k_1 = 1, k_2 = 2, \cdots, k_k = k) = \sum_{j=1}^{k} \binom{N-j}{k-j+1} = \binom{N}{k} - 1$.
The minimum possible number is
$f(k_1 = N-k+1, k_2 = N-k+2, \cdots, k_k = N) = \sum_{j=1}^{k} \binom{k-j}{k-j+1} = 0$. Thus, the
encoding process is a one-to-one mapping, and the RN is in $\binom{N}{k}$-ary representation.
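
The time-bin mapping can be sketched in a few lines. The code below assumes that the mapping
function takes the form f = sum over j of C(N - k_j, k - j + 1), with k_j the 1-based position of
the j-th "1"; this form reproduces the maximum C(N,k) - 1 and the minimum 0 stated in the text,
but the function names and the list-of-bits input are illustrative choices.

```python
from itertools import combinations
from math import comb


def encode_block(bits):
    """Rank a time-bin block among all blocks with the same number of '1's."""
    n = len(bits)
    positions = [i + 1 for i, b in enumerate(bits) if b]  # 1-based k_j
    k = len(positions)
    # f = sum_{j=1..k} C(N - k_j, k - j + 1); math.comb returns 0 when the
    # lower index exceeds the upper one.
    return sum(comb(n - kj, k - j + 1) for j, kj in enumerate(positions, start=1))


def all_ranks(n, k):
    """Ranks of every block of n bins that contains exactly k '1's."""
    ranks = []
    for ones in combinations(range(n), k):
        bits = [1 if i in ones else 0 for i in range(n)]
        ranks.append(encode_block(bits))
    return ranks
```

With N = 4 and a single avalanche in the second bin, `encode_block([0, 1, 0, 0])` gives 2, and
enumerating all blocks with `all_ranks` shows that each value from 0 to C(N,k) - 1 appears exactly
once.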

Taking into account the wide applications of binary RNs, the $\binom{N}{k}$-ary encoding method
can be taken further and modified by the binary method proposed by Elias [20]. The method expands
$\binom{N}{k}$ into subblocks as follows:

\[ \binom{N}{k} = \alpha_m 2^{m} + \alpha_{m-1} 2^{m-1} + \cdots + \alpha_0 2^{0}, \qquad (7) \]

so that $\alpha_m, \alpha_{m-1}, \cdots, \alpha_0$ are the binary expansion coefficients of the
integer $\binom{N}{k}$, where $\alpha_m = 1$ and $\alpha_i = 0$ or 1 for $0 \le i < m$. The
subblock related to the $\alpha_0$ term should be abandoned, as it contains either one or no
member and cannot be encoded into RNs. Suppose the non-zero binary expansion coefficients are
$\alpha_m, \alpha_{n_1}, \alpha_{n_2}, \cdots$ with $m > n_1 > n_2 > \cdots$. If
$f(k_1, k_2, \cdots, k_k) < 2^{m}$, convert $f(k_1, k_2, \cdots, k_k)$ into an $m$-bit binary
number directly. If
$2^{m} + 2^{n_1} + \cdots + 2^{n_i} \le f(k_1, k_2, \cdots, k_k) < 2^{m} + 2^{n_1} + \cdots + 2^{n_{i+1}}$
(for $i = 0$ the lower bound is $2^{m}$), then convert
$f'(k_1, k_2, \cdots, k_k) = f(k_1, k_2, \cdots, k_k) - (2^{m} + 2^{n_1} + \cdots + 2^{n_i})$
into an $n_{i+1}$-bit number directly (the schematic graph of the encoding process is shown in
Figure 1). $f(k_1, k_2, \cdots, k_k)$ is abandoned if $n_{i+1} = 0$.

FIG. 1: (Color online) Schematic graph of the encoding process in time sequence, where N = 4. For
the first N time bins (detection windows), $k = 1$, $\binom{N}{k} = 4 = 2^2$,
$f(k_1, k_2, \cdots, k_k) < 2^2$ always holds and $f(k_1, k_2, \cdots, k_k) = 2$ converts into
the 2-bit number "10" directly. For the second N time bins, $k = 2$,
$\binom{N}{k} = 6 = 2^2 + 2^1$, $2^2 \le f(k_1, k_2, \cdots, k_k) = 5 < 2^2 + 2^1$ and
$f'(k_1, k_2, \cdots, k_k) = 5 - 2^2 = 1$; thus, $f'(k_1, k_2, \cdots, k_k)$ converts into the
1-bit number "1" directly.

The encoding method constructed requires p to be constant among the N time bins (detection
windows) of the same block. But the p values in different blocks are not necessarily identical,
so the method is robust to environmental noise. As the detection interval of an APD can be as
short as ~ns, a block lasts only ~μs when N ~ 100. It is reasonable to consider that parameters
that depend on the environment and vary slowly are constant over such a short time. These
parameters can be laser intensity, temperature, etc. In addition, the raw RNs generated remain
uniform even if the slowly varying interferences are periodic (see Section III and Section IV).
Thus, the encoding method is effective, efficient and robust in practice.

The encoding method is effective for any $k \ne 0, N$. Taking into account all possible values of
$k$, the average encoding efficiency per time bin before binary expansion is

\[ H(N,p) = \frac{1}{N} \sum_{k=1}^{N-1} \binom{N}{k} p^{k} (1-p)^{N-k} \log_2 \binom{N}{k}. \qquad (8) \]

A higher $H(N,p)$ indicates a higher efficiency. For any integer $N > 2$, the optimal $p$ for the
average encoding efficiency $H(N,p)$ is 1/2, and $H(N,p) \to S(p)$ as $N \to \infty$ (see Figure
2), where $S(p)$ is the Shannon entropy of a single Bernoulli trial. In addition, $H(N, 1/2)$
increases with N and converges to 1 (the projection in the circular blue curve of Figure 2). For
instance, $H(5, 1/2) = 0.5604$, while $H(10, 1/2) = 0.7294$; the efficiency is much higher than
that of previous methods based on single-photon discrimination [10, 15, 17]. The theorems are
proven in the Appendix.
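
The quoted efficiencies can be reproduced with a short calculation. The sketch below assumes the
average efficiency has the form H(N,p) = (1/N) * sum over k = 1..N-1 of
C(N,k) p^k (1-p)^(N-k) log2 C(N,k), as in Theorem 1 of the Appendix; the function names are
illustrative.

```python
from math import comb, log2


def encoding_efficiency(n, p):
    """Average extracted bits per time bin before binary expansion."""
    total = 0.0
    for k in range(1, n):
        n_k = comb(n, k)  # number of equiprobable blocks with k '1's
        total += n_k * p**k * (1.0 - p) ** (n - k) * log2(n_k)
    return total / n


def shannon_entropy(p):
    """Shannon entropy of a single Bernoulli trial, in bits."""
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)
```

Evaluating `encoding_efficiency(5, 0.5)` and `encoding_efficiency(10, 0.5)` gives values that
round to the 0.5604 and 0.7294 quoted in the text, and for any N the result stays below
`shannon_entropy(p)`.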

FIG. 2: (Color online) The average encoding efficiency per time bin increases with N. $H(N,p)$
(the three-dimensional blue curve) converges to $S(p)$ (the dotted red curve) for infinite N. The
projections on the left side are the efficiencies of the $N_k$-ary (circular blue curve) and
binary (triangular pink curve) encoding methods for different N when p = 1/2. The projection
shows that the encoding efficiency converges to 1 for infinite N when p = 1/2. The corresponding
efficiency after the binary expansion is lower (N > 2) but converges to the $N_k$-ary one at
large N.

The more subblocks $\binom{N}{k}$ is divided into, the fewer possible combinations, and thus the
less uncertainty, each subblock contains. In addition, the uncertainty among different subblocks
(blocks) is not utilized in either of the encoding methods above. Thus, more subblocks mean more
uncertainty among subblocks and hence less extracted entropy and lower encoding efficiency, as
the total entropy is conserved. The output sequences in binary representation are therefore
obtained at the cost of entropy or efficiency, and the efficiency becomes

\[ H_b(N,p) = \frac{1}{N} \sum_{k=1}^{N-1} p^{k} (1-p)^{N-k} \left( \alpha_{m,k} 2^{m} \cdot m + \alpha_{m-1,k} 2^{m-1} \cdot (m-1) + \cdots + \alpha_{0,k} 2^{0} \cdot 0 \right), \qquad (9) \]

where the subscript $k$ means there are $k$ "1"s in the block. The efficiency after expansion is
shown in Figure 2 (the projection in the triangular pink curve). Moreover, more blocks require
more resources.
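
The binary expansion step and its efficiency can be sketched as follows. The code assumes that a
rank falling into a subblock of size 2^e is emitted as an e-bit string after subtracting the sizes
of the preceding subblocks, and that ranks in the size-1 subblock are discarded; the names
`subblock_exponents`, `to_bits` and `binary_efficiency` are illustrative.

```python
from math import comb


def subblock_exponents(total):
    """Exponents of the set bits in the binary expansion of total, largest first."""
    return [e for e in range(total.bit_length() - 1, -1, -1) if (total >> e) & 1]


def to_bits(rank, total):
    """Convert a rank in [0, total) to a bit string; '' means the rank is discarded."""
    offset = 0
    for e in subblock_exponents(total):
        if rank < offset + (1 << e):
            return format(rank - offset, "b").zfill(e) if e > 0 else ""
        offset += 1 << e
    raise ValueError("rank out of range")


def binary_efficiency(n, p):
    """Average output bits per time bin after binary expansion."""
    total_bits = 0.0
    for k in range(1, n):
        # each of the 2^e blocks in a subblock yields e bits
        weighted = sum(e * (1 << e) for e in subblock_exponents(comb(n, k)))
        total_bits += p**k * (1.0 - p) ** (n - k) * weighted
    return total_bits / n
```

For N = 4 this reproduces the two cases of Figure 1: `to_bits(2, 4)` gives "10" and
`to_bits(5, 6)` gives "1". With p = 0.01, `binary_efficiency(4, 0.01)` rounds to the 0.0197
quoted for dark counts in Section IV.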

The afterpulsing effect introduces bias into the IID events above. Its influence on the QRNG is
discussed in Section IV. It should be noted that a similar spatial encoding method has been
proposed for a different physical system by Marangon et al. [25].

III. EXPERIMENTAL SETUP 

Three scenarios were designed in order to evaluate the feasibility and the robustness of this
scheme. (a) Avalanche signals from a single APD were acquired, and the data were encoded
according to the method of Section II in order to verify whether the method can generate
high-quality raw RNs. A laser diode (LD) was added in this setup as an optional light source to
increase the RN generation rate. By modulating the power of the LD, we simulated an additional
noise and the variation of the avalanche efficiency, so that the robustness of the scheme was
evaluated. (b) We added an additional APD to the system of setup (a). Two APDs were grouped, and
the avalanche pulses were gathered and processed in parallel to generate RNs. This setup was to
evaluate the possibility of increasing the RN generation rate with APD arrays while keeping the
high quality of the RNs. (c) In order to demonstrate that the scheme can work properly without a
light source, the LD of setup (b) was removed and the avalanche pulses were generated only by the
dark counts of the APDs.

FIG. 3: (Color online) Schematic setup of the experiment. LD: laser diode; IM: optical intensity
modulator; BS: beam splitter; Att: electronically variable optical attenuator (EVOA); OF: optical
fiber; APD: avalanche photodiode; TDC: time-to-digital converter; PC: personal computer.

The system diagram is shown in Figure 3. A pulsed LD with a wavelength of 1550 nm was used as an
optional light source and was triggered by 1 MHz electronic pulses. An intensity modulator (IM)
following the LD was used to modulate the power of the light pulses from the LD. The output light
pulses from the IM were divided into two parts by a beam splitter (BS) and were attenuated to the
single-photon level by two electronically variable optical attenuators (EVOAs). Then the light
pulses of the two paths were coupled to two APDs (PGA-300, Princeton Lightwave) individually. The
APDs worked in Geiger mode. The trigger frequency was 1 MHz and the gate width was 2.5 ns.

APDs used to detect single photons are commonly cooled to between -30 °C and -50 °C in order to
reduce the dark count rate, for example in quantum key distribution applications. In our
experiments, the dark counts of the APDs can be used as a resource to generate RNs, as can
external photons. Thus, cooling the APDs is not necessary, which makes the system more practical
and less expensive. The APDs in our experiments were operated at RT (approximately 23 °C).

The avalanche pulses of the two APDs were discriminated and amplified, and were then sent into a
time-to-digital converter (TDC; Agilent U1051A Acqiris TC890) to be processed. The TDC has one
input channel for the start signal and 6 input channels for stop signals, and can convert the
time intervals between the stop and start signals into 32-bit numbers. The discrimination results
from the TDC were sequentially numbered with a time bin of 1 μs and subsequently encoded into RNs
according to the method of Section II in real time. As a proof-of-principle experiment, the
encoding process was executed on every four successive detection windows. All trigger signals in
the system were synchronized by a home-made circuit, and the delay of each could be adjusted in
steps of 10 ps.

The system could be divided into three parts, as shown in Figure 3. In Scenario (a), Part 2 was
removed, so a single APD was used to generate RNs. In Scenario (c), Part 1 was removed, and the
RNs were generated exclusively by the dark counts of the APDs. We added a sinusoidal driving
signal to the IM in Scenario (a) and Scenario (b) with a frequency of 0.05 Hz and an amplitude of
3 V in order to simulate an external noise. The average counting probabilities of the two APDs
were initialized to 0.5 by adjusting the EVOAs ahead of them individually. Under the sinusoidal
modulation of the IM, the counting probabilities of the APDs varied from 0.3 to 0.7, which could
be regarded as an external noise on the APDs.


IV. RESULTS AND DISCUSSION 

In our experiments, we set N to 4, as discussed. The 16 possible types of detection results were
classified into 5 subsets according to the value of k. Two of these sixteen types, the subsets
with k = 0 and k = 4, were abandoned, while the other fourteen were used to generate RNs. The
generation rates of Scenario (a) and Scenario (b) are functions of time t. The encoding
efficiency after binary expansion of Scenario (a) is
$\bar{H}_b(N, p(t)) = \frac{1}{T} \int_0^{T} H_b(N, p(t))\, dt$, where $T$ is the modulation
period and $p(t) = 0.5 + 0.2 \sin(0.1 \pi t)$. Substituting N = 4, we obtain
$\bar{H}_b(4, p(t)) = 0.3454$, and the corresponding generation rate is about 0.34 Mbps. Scenario
(b) has double the generation rate, about 0.69 Mbps. As the dark count probability of the APDs
used in Scenario (c) is relatively low, the encoding efficiency per APD is
$H_b(4, 0.01) = 0.0197$ and the practical generation rate is about 0.04 Mbps.

With current technologies, the generation rate of QRNGs can reach 100 Mbps to Gbps at the cost of
using stable, high-resolution equipment [13, 14, 27-29]. Although the generation rate of our
proof-of-principle experiment is much lower than existing results, it can be increased remarkably
by several measures. First, an APD array with high integration density can generate random bits
concurrently with an acceptable growth in cost, as has been demonstrated in our experiment.
Second, a higher generation rate can be obtained with a higher APD gating frequency, which can
exceed 2 GHz [27, 28]. Third, owing to the simple encoding method of the system, a larger N can
be employed to improve the encoding efficiency, while the random number generation rate will not
be restricted by the processing of the raw key bits.



FIG. 4: (Color online) (a) (b) Afterpulsing probability per gate (the gate frequency is 10 MHz)
versus gate number after primary avalanche ignition (0 to 100), at room temperature (23 °C, red
circles) and at cooling temperature (-50 °C, blue dots). (c) Schematic time-sequence diagram of
the encoding process, where N = 4 and k = 1.


Afterpulsing is correlated with the primary avalanche [26]. Afterpulsing effects lead to bias in
the RNs generated. We compared the afterpulsing probabilities of an APD at cooling temperature
(CT, -50 °C) and at RT, as shown in Figure 4. The data were measured in Geiger mode with a 10 MHz
gating frequency. The total afterpulsing probability at RT is 3.3%, while it is 1.8% at CT.
According to Figure 4(a) and (b), the afterpulsing probability at RT is much larger for the first
few gates, but it decreases rapidly to zero and becomes smaller than that at CT after the eighth
gate.

Let $P_a(i)$ be the afterpulsing probability of the $i$-th gate after primary avalanche ignition.
The probabilities of the original IID Bernoulli trials are no longer equal, and the extracted
entropy becomes smaller. For the case of $k = 1$, shown in Figure 4(c), the probabilities of the
different events become

\[
\begin{aligned}
P_{k=1}(1) &= p\,(1 - p - P_a(1))(1 - p - P_a(2))(1 - p - P_a(3)), \\
P_{k=1}(2) &= (1 - p)\,p\,(1 - p - P_a(1))(1 - p - P_a(2)), \\
P_{k=1}(3) &= (1 - p)(1 - p)\,p\,(1 - p - P_a(1)), \\
P_{k=1}(4) &= (1 - p)(1 - p)(1 - p)\,p,
\end{aligned}
\qquad (10)
\]


where $P_{k=1}(i)$ represents the probability that the avalanche happens in the $i$-th time bin.
For the experiments here (the gating frequency is 1 MHz), $P_a(1) = 4.3 \times 10^{-4}$,
$P_a(2) = P_a(3) = 0$, $P_{k=1}(1) = 0.062446$, $P_{k=1}(2) = 0.062446$, $P_{k=1}(3) = 0.062446$
and $P_{k=1}(4) = 0.0625$, where $p = 0.5$ is adopted. With the four probabilities normalized to
sum to one, the extracted entropy becomes
$S = -\sum_i P_{k=1}(i) \log_2 P_{k=1}(i) = 2 - 10^{-7}$. The reduction of the extracted entropy
is negligible. When the gating frequency is 10 MHz, the afterpulsing effect is stronger but still
not significant. The corresponding extracted entropy is reduced by as little as 2.8 x 10“^, while
the corresponding generation rate will be 6.9 Mbps. For higher gating frequencies and count
rates, the afterpulsing probability is not a catastrophic problem. InGaAs APDs with a 2 GHz
gating frequency and with a count rate as high as 650 Mcount/s have been realized, respectively
[27, 28]. The afterpulsing probabilities, 4% for reference [27] and 1.5% for reference [28], are
still at the same level. For the multi-avalanche case of a high-speed APD, the analysis of the
afterpulsing effect is much more complex and requires further study.

FIG. 5: (Color online) The uniformity of the RNs output from the RNG scheme. For every four
detection windows, x or y = 0 indicates that no avalanche pulse is detected, x or y ∈ [1, 4]
indicates that one avalanche pulse is detected, x or y ∈ [5, 10] indicates that two avalanche
pulses are detected, x or y ∈ [11, 14] indicates that three avalanche pulses are detected, and x
or y = 15 means that four avalanche pulses are detected, where x and y are integers. The altitude
represents the population of detection results in which x and y happen successively.
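
The effect of afterpulsing on a k = 1 block can be checked numerically. The sketch below evaluates
the four event probabilities of Eq. (10) for N = 4 and then computes the entropy of the
distribution after normalizing the four probabilities to sum to one; the normalization step and
the function names are assumptions made for this illustration.

```python
from math import log2


def single_click_probabilities(p, pa):
    """P_{k=1}(i) for i = 1..4 with N = 4; pa holds [Pa(1), Pa(2), Pa(3)]."""
    n = 4
    probs = []
    for i in range(1, n + 1):
        value = (1.0 - p) ** (i - 1) * p  # empty bins before the avalanche, then the avalanche
        for gate in range(1, n - i + 1):  # later bins must stay empty despite afterpulsing
            value *= 1.0 - p - pa[gate - 1]
        probs.append(value)
    return probs


def normalized_entropy(probs):
    """Shannon entropy in bits of the distribution obtained by normalizing probs."""
    total = sum(probs)
    return -sum((x / total) * log2(x / total) for x in probs)
```

With p = 0.5 and Pa(1) = 4.3e-4, the first three probabilities come out near 0.062446 and the last
is 0.0625, and the entropy falls short of 2 bits by roughly 1e-7, which supports the statement
that the reduction is negligible at 1 MHz.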

We analyzed the uniformity of the RNs generated in these experiments to test the independence of
different detection events. Only the uniformity of the detection results of Scenario (a) is shown
(Figure 5), because the other two experiments have very similar properties. Let $p(x, y)$
represent the population of the elements of a 16 × 16 matrix, that is, the number of times that
neighboring detection results x and y happen successively, where the integers
$x, y \in \{0, 1, 2, \cdots, 15\}$ represent the 16 possible detection results for every four
consecutive detections.

In Figure 5, different colors indicate different populations $p(x, y)$. Clearly, the matrix is
symmetric and is partitioned into subblocks. The symmetry of the matrix shows that
$p(x, y) = p(y, x)$ for any x, y, which means that there is no time correlation between
successive x and y. The values of $p(x, y)$ for different elements in the same subblock are
identical, which means that the population is uniform within subblocks. Thus
$p(x, y) = p(y, x) = p(x) p(y)$. This shows that each detection event is independent, and that
the imperfections of the beam splitter and of the detection efficiencies make no difference to
the uniformity of the randomness extraction system. The experimental results are consistent with
the theory proposed in Section II.
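
The population matrix can be built from a stream of four-bin blocks as sketched below. The
labelling follows the ranges given in the caption of Figure 5 (0 for no avalanche, 1 to 4 for one,
5 to 10 for two, 11 to 14 for three, 15 for four); the order of labels inside each range, the
simulated IID input and all function names are assumptions for illustration only.

```python
import random
from math import comb


def block_label(bits):
    """Label 0..15 for a 4-bin block: offset of its k-subset plus the rank inside it."""
    n, k = len(bits), sum(bits)
    positions = [i + 1 for i, b in enumerate(bits) if b]
    rank = sum(comb(n - kj, k - j + 1) for j, kj in enumerate(positions, start=1))
    offset = sum(comb(n, m) for m in range(k))  # labels used by smaller k
    return offset + rank


def pair_counts(labels, size=16):
    """Count how often label x is immediately followed by label y."""
    matrix = [[0] * size for _ in range(size)]
    for x, y in zip(labels, labels[1:]):
        matrix[x][y] += 1
    return matrix


def simulate_labels(num_blocks, p, seed=0):
    """Labels of simulated blocks whose bins fire independently with probability p."""
    rng = random.Random(seed)
    return [
        block_label([int(rng.random() < p) for _ in range(4)])
        for _ in range(num_blocks)
    ]
```

For independent blocks the counts at (x, y) and (y, x) should agree within statistical
fluctuation, and entries whose x and y stay within the same pair of subsets should be equal on
average, which is the pattern described for Figure 5.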

Min-entropy evaluation was employed for all these experiments. Min-entropy, defined as

\[ H_\infty = -\log_2 \{ \max p(x_i) \}, \qquad (11) \]

is the evaluation of the worst case, where $p(x_i)$ is the probability of the possible output
$x_i$, and $\max p(x_i)$ is the maximal value of all $p(x_i)$. It is a stringent measure of the
information content, while Shannon entropy is a weighted average evaluation. Both min-entropy and
Shannon entropy are special cases of Rényi entropy [30, 31]. Shannon entropy is an upper bound of
min-entropy, and they coincide if and only if the distribution of the variable is uniform
[32, 33]. Min-entropy evaluation is therefore a good way to evaluate the quality of randomness of
RNs.

FIG. 6: (Color online) (a) (b) (c) Min-entropy of samples (data points) output from Scenarios
(a), (b) and (c), respectively, versus bit length d (0 to 10). The linear fitting function
(fitting line) is shown. (d) The relative deviations between the min-entropy and the Shannon
entropy of the uniform distribution for all three scenarios.
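
A minimal sketch of the two entropy measures, estimated from the empirical frequencies of a list
of output symbols, is given below. Estimating the probabilities from symbol counts and the
function names are assumptions of this illustration.

```python
from collections import Counter
from math import log2


def min_entropy(samples):
    """H_inf = -log2(max_i p(x_i)), with p estimated from symbol frequencies."""
    counts = Counter(samples)
    return -log2(max(counts.values()) / len(samples))


def shannon_entropy(samples):
    """H = -sum_i p(x_i) log2 p(x_i), with p estimated from symbol frequencies."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())
```

For a perfectly uniform list of four symbols both functions return 2 bits, while any bias makes
the min-entropy drop faster than the Shannon entropy, which is why the gap between the two is used
as a quality indicator.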

The min-entropy of the raw data from all scenarios was evaluated, where $i = 0, 1, \cdots, d-1$
and $p(x_0), p(x_1), \cdots, p(x_{d-1})$ represent the probabilities of "0", "1", ..., "d-1" in
binary representation, respectively. As shown in Figure 6, the deviations between the min-entropy
and the Shannon entropy of the uniform distribution are all of the order of 0.001. This is within
the statistical error (^ 3.2x10“^, the red dotted line in Figure 6(d)), as the statistical amount
is 10® • 2‘^ bits, which indicates a good quality of randomness of the RNs.

The raw binary RN samples output by the binary encoding method were tested using the NIST
statistical test suite [34]. The standard statistical test suite, containing 15 subtests, calls
for a 1-Gbit sample. The standard statistical test outputs two values, the p-value and the pass
proportion, for each item. The sample passes the standard statistical test if and only if the
p-values and proportions of all items are larger than 0.0001 and 0.9805608, respectively. Twenty
samples for each of Scenario (a) and Scenario (b) and 7 samples for Scenario (c) were tested by
the standard statistical test. The testing results are shown in Table I. Two samples of Scenario
(a), 4 samples of Scenario (b) and 1 sample of Scenario (c) failed the statistical test. All
failed samples failed only the NonOverlappingTemplate test. The focus of the
NonOverlappingTemplate test is the number of occurrences of pre-specified target strings. The
purpose of this test is to detect generators that produce too many occurrences of a given
non-periodic (aperiodic) pattern. The test outputs 148 groups of p-values and pass proportions
for different pre-specified 9-bit target strings. The sequence has irregular occurrences of the
possible template patterns and fails the test if the p-value is smaller than the preset value.
For every failed sample, only one of the 148 groups fell below the preset pass proportion of
0.980. The minimum pass proportion is 0.975, which is very close to 0.980. We could not identify
the cause of the failures and suspect that the afterpulsing effect is a potential candidate.
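
The pass-proportion threshold of 0.9805608 is consistent with the acceptance interval used by the
NIST suite, p_hat - 3 * sqrt(p_hat * (1 - p_hat) / m) with p_hat = 1 - alpha. The sketch below
assumes a significance level alpha = 0.01 and m = 1000 sequences per sample (for example a 1-Gbit
sample split into 1-Mbit sequences); these two values are assumptions, since the text does not
state them.

```python
from math import sqrt


def min_pass_proportion(num_sequences, alpha=0.01):
    """Lower edge of the acceptance interval for the proportion of passing sequences."""
    p_hat = 1.0 - alpha
    return p_hat - 3.0 * sqrt(p_hat * (1.0 - p_hat) / num_sequences)
```

Under these assumptions `min_pass_proportion(1000)` evaluates to about 0.98056, matching the
threshold quoted above, so a group with a pass proportion of 0.975 misses it by a small margin.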

It is worth noting that the samples to be tested were extracted continuously over days. Thus, the
interferences that may have affected the experiments were more complex. In all scenarios, no
special measures were adopted to reduce the interference of background light noise and
temperature fluctuation. Despite a few failed tests, the results of the tests and the analysis
above indicate good quality of randomness of the raw data from all three scenarios, and
postprocessing is not necessary. The results of Scenario (a) were consistent with the theoretical
analysis and show the robustness of our RNG against slowly varying interferences. Scenario (b)
and Scenario (c), with two APDs, suggest an APD array scheme, which is promising for breaking
through the generation rate limitation. Scenario (c) also gave a relatively strict proof that the
dark count of an APD is usable for RNG. The light source for this scheme, to be or not to be, is
no longer a question.

It should be mentioned that although the RNG can generate high-quality raw random bits,
additional postprocessing methods [35] can still be used to further improve the quality of the
final output. The generic framework of randomness evaluation and postprocessing algorithms
proposed in reference [35] provides an instructive guideline for the design of random signal
extractors to achieve a tradeoff between the quality of randomness and the cost.


V. CONCLUSION 

In conclusion, we have proposed and realized a robust and high-efficiency TRNG scheme. Dark
counts of the APD can be used as a resource in this scheme, so a deep cooling process is not
necessary and the system can work at RT. The fluctuation of the pulse intensity arriving at the
APD and slowly varying interferences affect only the efficiency of the RNG rather than the
randomness of the RN series, so the scheme is compatible with a light source and background
photons. The experimental results also indicate the feasibility of integrating an APD array into
an RNG chip, which can effectively increase the generation rate of RNs, even up to Gbps, and give
the scheme more practical value.
Lemma 1. Define a function $f$ by

$$f(N,k,p) = p^{k}(1-p)^{N-k} + p^{N-k}(1-p)^{k}.$$

Then, if $N \ge 2$ and $p \in (0,\tfrac12)\cup(\tfrac12,1)$, the equation in $k$

$$f(N,k,p) = f(N,k,\tfrac12) \tag{12}$$

has exactly two roots in $[0,N]$.

Proof. First, we rewrite equation (12) as

$$\left(\frac{p}{1-p}\right)^{k}(1-p)^{N} + \left(\frac{1-p}{p}\right)^{k}p^{N} = 2\cdot\left(\tfrac12\right)^{N}. \tag{13}$$

The left side of the equation is symmetric about $p=\tfrac12$, so we only
need to consider the case $p\in(\tfrac12,1)$, and then $\frac{p}{1-p}>1$.
Let $a=\left(\frac{p}{1-p}\right)^{k}$. Then for any $k\in[0,N]$, we have
$a\in\left[1,\left(\frac{p}{1-p}\right)^{N}\right]$. As $k$ and $a$ are in
one-to-one correspondence, equation (13) becomes

$$(1-p)^{N}a + p^{N}\frac{1}{a} = 2\cdot\left(\tfrac12\right)^{N}.$$

Define the function

$$\varphi(a) = (1-p)^{N}a + p^{N}\frac{1}{a} - 2\cdot\left(\tfrac12\right)^{N}.$$

As $p\in(\tfrac12,1)$, we have

$$\varphi(1) = \varphi\!\left[\left(\frac{p}{1-p}\right)^{N}\right] = (1-p)^{N} + p^{N} - 2\cdot\left(\tfrac12\right)^{N} > 0.$$

Next, the derivative of $\varphi$ is

$$\varphi'(a) = (1-p)^{N} - \frac{p^{N}}{a^{2}}.$$

Setting $\varphi'(a_0)=0$ with $a_0\in\left(1,\left(\frac{p}{1-p}\right)^{N}\right)$,
we obtain $a_0=\left(\frac{p}{1-p}\right)^{N/2}$. Furthermore, if
$a\in[1,a_0)$, then $\varphi'(a)<0$; if
$a\in\left(a_0,\left(\frac{p}{1-p}\right)^{N}\right]$, then $\varphi'(a)>0$.
In addition,

$$\varphi(a_0) = 2\sqrt{p^{N}(1-p)^{N}} - 2\cdot\left(\tfrac12\right)^{N} < 0,$$

so the equation $\varphi(a)=0$ has exactly two roots in
$\left(1,\left(\frac{p}{1-p}\right)^{N}\right)$.

As $k$ and $a$ are in one-to-one correspondence, it follows that equation
(12) has exactly two roots in $(0,N)$, which we denote by $x_1$ and $x_2$,
$x_1<x_2$. If $k\in[0,x_1)\cup(x_2,N]$, then
$f(N,k,p)>f(N,k,\tfrac12)$; if $k\in(x_1,x_2)$, then
$f(N,k,p)<f(N,k,\tfrac12)$. In addition, as $f(N,k,p)=f(N,N-k,p)$,
we have $x_1+x_2=N$.

This completes the proof of Lemma 1.

□ 
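As an illustrative numerical check of Lemma 1 (not part of the original proof), the two roots can be computed in closed form, because $\varphi(a)=0$ is a quadratic equation in $a$. The sketch below assumes real-valued $k$ and ordinary floating-point arithmetic; the helper name `lemma1_roots` is hypothetical and introduced only for this example.

```python
from math import log, sqrt

def f(N, k, p):
    """f(N, k, p) = p^k (1-p)^(N-k) + p^(N-k) (1-p)^k, for real k in [0, N]."""
    return p**k * (1 - p)**(N - k) + p**(N - k) * (1 - p)**k

def lemma1_roots(N, p):
    """Roots x1 < x2 of f(N, k, p) = f(N, k, 1/2), via a = (q/(1-q))^k.

    Hypothetical helper: f is symmetric under p -> 1-p, so we work with
    q = max(p, 1-p) in (1/2, 1), as in the proof.
    """
    if N < 2 or not (0 < p < 1) or p == 0.5:
        raise ValueError("Lemma 1 needs N >= 2 and p in (0,1/2) U (1/2,1)")
    q = max(p, 1 - p)
    A, B, C = (1 - q)**N, q**N, 2 * 0.5**N
    # phi(a) = A*a + B/a - C = 0  <=>  A*a^2 - C*a + B = 0
    disc = sqrt(C * C - 4 * A * B)   # positive because q(1-q) < 1/4
    a1, a2 = (C - disc) / (2 * A), (C + disc) / (2 * A)
    r = log(q / (1 - q))             # k = ln(a) / ln(q/(1-q))
    return log(a1) / r, log(a2) / r

if __name__ == "__main__":
    N, p = 10, 0.3
    x1, x2 = lemma1_roots(N, p)
    print("roots:", x1, x2, "sum:", x1 + x2)
    for k in range(N + 1):
        sign = ">" if f(N, k, p) > f(N, k, 0.5) else "<"
        print(k, "f(N,k,p)", sign, "f(N,k,1/2)")
```

Because the product of the two roots in $a$ equals $\left(\frac{p}{1-p}\right)^{N}$, the two roots in $k$ sum to $N$, which is the symmetry stated at the end of the proof.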

Theorem 1. For any integer $N \ge 2$, the optimal
$p$ for the normalized extracted entropy $H(N,p)$ is $\tfrac12$, and
$H(N,p)\to S(p)$ as $N\to\infty$, where $S(p)$ is the Shannon
entropy of a single Bernoulli trial, and $H(N,p)$ is
defined as

$$H(N,p) = -\frac{1}{N}\sum_{k=1}^{N-1} N_k\, p^{k}(1-p)^{N-k}\left(\log_2\frac{1}{N_k}\right),$$

and $N_k=\binom{N}{k}$ is the binomial coefficient.


Proof. Let

$$f(N,k,p) = p^{k}(1-p)^{N-k} + p^{N-k}(1-p)^{k},$$

then

$$\begin{aligned}
H(N,p) &= -\frac{1}{N}\sum_{k=1}^{N-1} N_k\, p^{k}(1-p)^{N-k}\left(\log_2\frac{1}{N_k}\right) \\
&= \frac{1}{N}\sum_{k=0}^{N} N_k\, p^{k}(1-p)^{N-k}\,(\log_2 N_k) \\
&= \frac{1}{2N}\sum_{k=0}^{N}\left[p^{k}(1-p)^{N-k} + p^{N-k}(1-p)^{k}\right] N_k\,(\log_2 N_k) \\
&= \frac{1}{2N}\sum_{k=0}^{N} f(N,k,p)\, N_k \log_2 N_k.
\end{aligned} \tag{14}$$

It is clear that when $p\in(0,\tfrac12)\cup(\tfrac12,1)$,

$$\sum_{k=0}^{N} f(N,k,p)\,N_k = \sum_{k=0}^{N} f(N,k,\tfrac12)\,N_k = 2,$$

and $f(N,0,p)=(1-p)^{N}+p^{N} > 2\cdot\left(\tfrac12\right)^{N} = f(N,0,\tfrac12)$.
Thus, there exists an integer $k_0\in(0,N)$ such that
$f(N,k_0,p)<f(N,k_0,\tfrac12)$. Additionally, from the proof
of Lemma 1 we know there exist $x_1,x_2\in(0,N)$,
$x_1<x_2$ and $x_1+x_2=N$, such that
$f(N,x_1,p)-f(N,x_1,\tfrac12)=f(N,x_2,p)-f(N,x_2,\tfrac12)=0$, and
$f(N,k,p)<f(N,k,\tfrac12)$ if and only if $k\in(x_1,x_2)$; thus, we have
$k_0\in(x_1,x_2)$. In addition, if $x_1,x_2$ are integers, we have
$k_0\in[x_1+1,x_2-1]$; otherwise $k_0\in[\lfloor x_1\rfloor+1,\lfloor x_2\rfloor]$, where
$\lfloor x_1\rfloor$ represents the largest integer that is not larger than
$x_1$.

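With the conclusion above, we will show that when $N\ge 2$
and $p\in(0,\tfrac12)\cup(\tfrac12,1)$, we have $H(N,p)<H(N,\tfrac12)$.
There are two cases.

Case 1: $x_1,x_2$ are not integers, then

$$\begin{aligned}
&2\sum_{k=0}^{\lfloor x_1\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]N_k \\
&\quad= \left(\sum_{k=0}^{\lfloor x_1\rfloor}+\sum_{k=\lfloor x_2\rfloor+1}^{N}\right)\left[f(N,k,p)-f(N,k,\tfrac12)\right]N_k \\
&\quad= \left[2-\sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor} f(N,k,p)\,N_k\right]
 - \left[2-\sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor} f(N,k,\tfrac12)\,N_k\right] \\
&\quad= \sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor}\left[f(N,k,\tfrac12)-f(N,k,p)\right]N_k,
\end{aligned}$$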
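and

$$\begin{aligned}
&H(N,p)-H(N,\tfrac12) \\
&\quad= \sum_{k=0}^{N}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\quad= \left(\sum_{k=0}^{\lfloor x_1\rfloor}+\sum_{k=\lfloor x_2\rfloor+1}^{N}\right)\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\qquad+ \sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\quad= 2\sum_{k=0}^{\lfloor x_1\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\qquad+ \sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\quad\le 2\sum_{k=0}^{\lfloor x_1\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_{\lfloor x_1\rfloor} \\
&\qquad+ \sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_{\lfloor x_1\rfloor+1} \\
&\quad= \sum_{k=\lfloor x_1\rfloor+1}^{\lfloor x_2\rfloor}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k
 \cdot\left[\log_2 N_{\lfloor x_1\rfloor+1}-\log_2 N_{\lfloor x_1\rfloor}\right] \\
&\quad< 0.
\end{aligned}$$

The last inequality holds because there exists an integer
$k_0\in[\lfloor x_1\rfloor+1,\lfloor x_2\rfloor]$, and
$\lfloor x_1\rfloor+1\le\lfloor x_2\rfloor=N-1-\lfloor x_1\rfloor$;
thus $\lfloor x_1\rfloor+1\le\tfrac{N}{2}$, so $N_{\lfloor x_1\rfloor+1}>N_{\lfloor x_1\rfloor}$.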

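Case 2: $x_1,x_2$ are integers, then analogously,

$$\begin{aligned}
&H(N,p)-H(N,\tfrac12) \\
&\quad= 2\sum_{k=0}^{x_1}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\qquad+ \sum_{k=x_1+1}^{x_2-1}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_k \\
&\quad\le 2\sum_{k=0}^{x_1}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_{x_1} \\
&\qquad+ \sum_{k=x_1+1}^{x_2-1}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k\log_2 N_{x_1+1} \\
&\quad= \sum_{k=x_1+1}^{x_2-1}\left[f(N,k,p)-f(N,k,\tfrac12)\right]\frac{1}{2N}N_k
 \cdot\left[\log_2 N_{x_1+1}-\log_2 N_{x_1}\right] \\
&\quad< 0.
\end{aligned}$$

The last inequality holds because there exists an integer
$k_0\in[x_1+1,x_2-1]$, and $x_1+1\le x_2-1=N-x_1-1$;
thus $x_1+1\le\tfrac{N}{2}$, so $N_{x_1+1}>N_{x_1}$. We have therefore
proved that the optimal $p$ for the normalized extracted
entropy $H(N,p)$ is $\tfrac12$, and we will next show the remaining
part.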

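First, as

$$2^{N} = \sum_{k=0}^{N} N_k,$$

we have $\frac{\log_2 N_k}{N}<1$ for $0\le k\le N$. Assume that
$0<p<\tfrac12$; the cases $\tfrac12<p<1$ and $p=\tfrac12$ are
similar. Then there exists $\delta>0$ sufficiently small such that
$p+\delta<\tfrac12$, and by the weak law of large numbers for a binomial
distribution,

$$\lim_{N\to\infty}\ \sum_{p-\delta<\frac{k}{N}<p+\delta} N_k\,p^{k}(1-p)^{N-k} = 1.$$

Thus, given any $\epsilon>0$, there is an $N_\epsilon$ such that for
$N>N_\epsilon$,

$$\sum_{|\frac{k}{N}-p|\ge\delta} N_k\,p^{k}(1-p)^{N-k} < \epsilon. \tag{15}$$

Thus, together with $\frac{\log_2 N_k}{N}<1$, we have

$$\sum_{|\frac{k}{N}-p|<\delta} N_k\,p^{k}(1-p)^{N-k}\,\frac{\log_2 N_k}{N}
 < H(N,p)
 < \sum_{|\frac{k}{N}-p|<\delta} N_k\,p^{k}(1-p)^{N-k}\,\frac{\log_2 N_k}{N} + \epsilon.$$

Because $p+\delta<\tfrac12$, when $|\frac{k}{N}-p|<\delta$, we have
$\log_2 N_{\lfloor N(p-\delta)\rfloor-1} < \log_2 N_k < \log_2 N_{\lfloor N(p+\delta)\rfloor+1}$, so

$$\sum_{|\frac{k}{N}-p|<\delta} N_k\,p^{k}(1-p)^{N-k}\,\frac{\log_2 N_{\lfloor N(p-\delta)\rfloor-1}}{N}
 < H(N,p)
 < \sum_{|\frac{k}{N}-p|<\delta} N_k\,p^{k}(1-p)^{N-k}\,\frac{\log_2 N_{\lfloor N(p+\delta)\rfloor+1}}{N} + \epsilon.$$

Together with equation (15), we get

$$(1-\epsilon)\,\frac{\log_2 N_{\lfloor N(p-\delta)\rfloor-1}}{N}
 < H(N,p)
 < \frac{\log_2 N_{\lfloor N(p+\delta)\rfloor+1}}{N} + \epsilon.$$

Using Stirling's formula on both sides,

$$(1-\epsilon)\,S(p-\delta) \le \lim_{N\to\infty} H(N,p) \le S(p+\delta)+\epsilon.$$

As $\epsilon$ and $\delta$ are arbitrary, with the continuity of $S(p)$, we
obtain $\lim_{N\to\infty}H(N,p)=S(p)$.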

This completes the proof of Theorem 1. □ 
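
The two claims of Theorem 1 can be spot-checked numerically. The following is a minimal sketch, assuming plain floating-point arithmetic and moderate $N$ (a few hundred at most, so that the binomial coefficients still fit in a float); the function names `H` and `S` simply mirror the symbols in the theorem and are not taken from any library.

```python
from math import comb, log2

def H(N, p):
    """Normalized extracted entropy:
    (1/N) * sum_{k=1}^{N-1} C(N,k) p^k (1-p)^(N-k) log2 C(N,k).
    """
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) * log2(comb(N, k))
               for k in range(1, N)) / N

def S(p):
    """Shannon entropy (in bits) of a single Bernoulli(p) trial."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

if __name__ == "__main__":
    # Claim 1: for fixed N, p = 1/2 maximizes H(N, p).
    for N in (2, 8, 32):
        grid = [i / 100 for i in range(1, 100)]
        best = max(grid, key=lambda p: H(N, p))
        print(f"N={N}: argmax over grid = {best}")
    # Claim 2: H(N, p) approaches S(p) from below as N grows.
    p = 0.3
    for N in (10, 50, 200):
        print(f"N={N}: H={H(N, p):.4f}  S(p)={S(p):.4f}")
```

For $N=2$ the definition reduces to $H(2,p)=p(1-p)$, which is maximal at $p=\tfrac12$ with value $\tfrac14$; this gives a quick hand check of the code.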
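Theorem 2. $H(N,\tfrac12)$ increases to 1 as $N$ approaches
infinity.

Proof. By Theorem 1, $H(N,\tfrac12)$ converges to 1 as $N$
approaches infinity. It therefore remains to be proven that
$H(N,\tfrac12)$ is an increasing function of $N$.

First, we have

$$\begin{aligned}
H(N+1,\tfrac12) &= \frac{1}{(N+1)\,2^{N+1}}\sum_{k=0}^{N+1}\binom{N+1}{k}\log_2\binom{N+1}{k} \\
&= \frac{1}{(N+1)\,2^{N+1}}\left[\sum_{k=0}^{N} N_k\log_2\!\left(N_k\,\frac{N+1}{N+1-k}\right)
 + \sum_{k=1}^{N+1} N_{k-1}\log_2\!\left(N_{k-1}\,\frac{N+1}{k}\right)\right] \\
&= \frac{1}{(N+1)\,2^{N+1}}\left[\sum_{k=0}^{N} N_k\log_2\!\left(N_k\,\frac{N+1}{N+1-k}\right)
 + \sum_{k=0}^{N} N_k\log_2\!\left(N_k\,\frac{N+1}{k+1}\right)\right] \\
&= \frac{N}{N+1}\,H(N,\tfrac12) + \frac{1}{(N+1)\,2^{N+1}}\sum_{k=0}^{N} N_k\log_2\frac{(N+1)^{2}}{(N+1-k)(k+1)}.
\end{aligned}$$

Thus, $H(N+1,\tfrac12)>H(N,\tfrac12)$ is equivalent to

$$\frac{1}{2^{N+1}}\sum_{k=0}^{N} N_k\log_2\frac{(N+1)^{2}}{(N+1-k)(k+1)}
 - \frac{1}{N\,2^{N}}\sum_{k=0}^{N} N_k\log_2 N_k
 = \frac{1}{2^{N+1}}\sum_{k=0}^{N} N_k\log_2\frac{(N+1)^{2}}{(N+1-k)(k+1)\,(N_k)^{2/N}} > 0.$$

As $(N+1-k)(k+1)\le\left(\tfrac{N}{2}+1\right)^{2}$ and $N_k\le N_{\lfloor N/2\rfloor}$, it is
sufficient to prove

$$\left(\frac{2N+2}{N+2}\right)^{N} > N_{\lfloor N/2\rfloor}. \tag{16}$$

Next, we prove inequality (16) by induction. When
$N=1$, $\left(\tfrac43\right)^{1}=\tfrac43>1=\binom{1}{0}$. When $N=2$,
$\left(\tfrac64\right)^{2}=\tfrac94>2=\binom{2}{1}$. Assume that when $N=k-1$,
$k\ge2$, we have
$\left(\frac{2k}{k+1}\right)^{k-1}>\binom{k-1}{\lfloor\frac{k-1}{2}\rfloor}$. Then,
when $N=k+1$, we have

$$\binom{k+1}{\lfloor\frac{k+1}{2}\rfloor}
 = \frac{k(k+1)}{\lfloor\frac{k+1}{2}\rfloor\left(k+1-\lfloor\frac{k+1}{2}\rfloor\right)}
 \binom{k-1}{\lfloor\frac{k-1}{2}\rfloor}.$$

By the induction hypothesis,

$$\binom{k+1}{\lfloor\frac{k+1}{2}\rfloor}
 < \frac{k(k+1)}{\lfloor\frac{k+1}{2}\rfloor\left(k+1-\lfloor\frac{k+1}{2}\rfloor\right)}
 \left(\frac{2k}{k+1}\right)^{k-1},$$

so it remains to show

$$\frac{k(k+1)}{\lfloor\frac{k+1}{2}\rfloor\left(k+1-\lfloor\frac{k+1}{2}\rfloor\right)}
 \left(\frac{2k}{k+1}\right)^{k-1} < \left(\frac{2k+4}{k+3}\right)^{k+1}. \tag{17}$$

We prove inequality (17) in two cases.

Case 1: $k$ is even, then $\lfloor\frac{k+1}{2}\rfloor=\tfrac{k}{2}$, and inequality (17)
is equivalent to

$$\begin{aligned}
&\frac{k(k+1)}{\frac{k}{2}\left(k+1-\frac{k}{2}\right)}\left(\frac{2k}{k+1}\right)^{k-1} < \left(\frac{2k+4}{k+3}\right)^{k+1} \\
\Leftrightarrow\ &\frac{k+1}{k+2}\left(\frac{k}{k+1}\right)^{k-1} < \left(\frac{k+2}{k+3}\right)^{k+1} \\
\Leftrightarrow\ &\left(\frac{k(k+3)}{(k+1)(k+2)}\right)^{k-1} < \frac{(k+2)^{3}}{(k+1)(k+3)^{2}} \\
\Leftrightarrow\ &\left(\frac{k^{2}+3k}{k^{2}+3k+2}\right)^{k-1} < \frac{(k+2)^{3}}{(k+1)(k+3)^{2}}.
\end{aligned} \tag{18}$$

Since

$$\begin{aligned}
\left(\frac{k^{2}+3k}{k^{2}+3k+2}\right)^{k-1}
&\le \frac{k^{2}+3k}{k^{2}+3k+2}\cdot\frac{k^{2}+3k+1}{k^{2}+3k+3}\cdots\frac{k^{2}+4k-2}{k^{2}+4k} \\
&= \frac{(k^{2}+3k)(k^{2}+3k+1)}{(k^{2}+4k-1)(k^{2}+4k)} \\
&= \frac{(k+3)(k^{2}+3k+1)}{(k+4)(k^{2}+4k-1)},
\end{aligned}$$

it remains to show

$$\begin{aligned}
&\frac{(k+3)(k^{2}+3k+1)}{(k+4)(k^{2}+4k-1)} < \frac{(k+2)^{3}}{(k+1)(k+3)^{2}} \\
\Leftrightarrow\ &(k+3)^{3}(k+1)(k^{2}+3k+1) < (k+2)^{3}(k+4)(k^{2}+4k-1) \\
\Leftrightarrow\ &(k^{4}+10k^{3}+36k^{2}+54k+27)(k^{2}+3k+1) \\
&\quad< (k^{4}+10k^{3}+36k^{2}+56k+32)(k^{2}+4k-1).
\end{aligned}$$

Since $k\ge2$, we have $k^{2}+3k+1\le k^{2}+4k-1$, and $54k+27<56k+32$, so the
inequality above holds.

Case 2: $k$ is odd, then $\lfloor\frac{k+1}{2}\rfloor=\tfrac{k+1}{2}$, and inequality (17)
is equivalent to

$$\begin{aligned}
&\frac{k(k+1)}{\left(\frac{k+1}{2}\right)^{2}}\left(\frac{2k}{k+1}\right)^{k-1} < \left(\frac{2k+4}{k+3}\right)^{k+1} \\
\Leftrightarrow\ &\left(\frac{k}{k+1}\right)^{k} < \left(\frac{k+2}{k+3}\right)^{k+1}.
\end{aligned}$$

Since $\frac{k}{k+1}<\frac{k+1}{k+2}$, it is sufficient to prove
$\frac{k+1}{k+2}\left(\frac{k}{k+1}\right)^{k-1}<\left(\frac{k+2}{k+3}\right)^{k+1}$.
Notice that the inequality above is equivalent to
inequality (18), so the rest of the proof is the same as in Case 1.

This completes the proof of Theorem 2.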

□ 
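
As a complement to the proof, the monotonic growth of $H(N,\tfrac12)$ and the auxiliary inequality (16) can be verified numerically for small $N$. This is only an illustrative sketch under the assumption of floating-point arithmetic and $N$ up to a few hundred; `H_half` and `inequality_16_holds` are hypothetical helper names introduced for this example.

```python
from math import comb, log2

def H_half(N):
    """H(N, 1/2) = (1 / (N * 2^N)) * sum_{k=0}^{N} C(N,k) * log2 C(N,k)."""
    total = sum(comb(N, k) * log2(comb(N, k)) for k in range(N + 1))
    return total / (N * 2**N)

def inequality_16_holds(N):
    """Check ((2N+2)/(N+2))^N > C(N, floor(N/2)), i.e. inequality (16)."""
    return ((2 * N + 2) / (N + 2)) ** N > comb(N, N // 2)

if __name__ == "__main__":
    previous = H_half(1)
    for N in range(2, 41):
        current = H_half(N)
        # Theorem 2: strictly increasing and bounded above by 1.
        assert previous < current < 1
        assert inequality_16_holds(N)
        previous = current
    for N in (2, 10, 50, 200):
        print(f"N={N}: H(N,1/2)={H_half(N):.4f}")
```

The printed values illustrate how slowly $H(N,\tfrac12)$ approaches 1, which is the efficiency cost of using a finite number of time bins $N$.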
















































